Harm Reduction Development and Validation

Author

Zach Budesa

Item Generation

Item generation proceeded in several stages. First, the authors drafted two sets of items (harm reduction principles and harm reduction strategies) and piloted them with harm reduction workers during local training and an evaluation of harm reduction supply distribution. Following the initial pilot, we revised and added items to address the gaps that had been noted. Items were written with reference to both the SAMHSA Harm Reduction Framework and the National Harm Reduction Coalition's Principles of Harm Reduction (https://harmreduction.org/about-us/principles-of-harm-reduction/). Once the initial item pool was generated, we consulted with Dr. David Frank, a harm reduction expert and researcher, and based on his feedback further modified items to be accessible to a general audience.

We used Wolf et al.'s (2023) Response Process Evaluation (RPE) method to ensure that a non-expert audience outside harm reduction would understand the item content. Over X rounds of data collection, we used Prolific to collect qualitative responses to each item. As a group, we evaluated each set of responses to determine whether respondents understood the intended meaning of each item. Based on respondents' answers and suggestions, we identified which items were generally understood and which should be modified or removed. The final item pool contained 45 items related to harm reduction strategies and principles. We phrased items as simply and directly as possible, occasionally splitting complex topics (e.g., experiences of discrimination) across multiple items when necessary. For Study 1, items were answered on a 6-point ordinal scale from 1 (Strongly Disagree) to 6 (Strongly Agree). Following inspection of item responses, Studies 2 and 3 used 5-point ordinal scales from 1 (Strongly Disagree) to 5 (Strongly Agree). Participants answered items in random order and completed a brief demographic survey.

Items

Code
library(gt)
library(tidyverse)

items <- readr::read_csv("references/codebooks/20231011_hr-scale-codebook.csv")

items <- items %>% 
  filter(
    grepl("q", var_label)) %>% 
  mutate(item = 
           str_remove_all(item, pattern = '\r\n')) %>% 
  select(-1)
  
items %>% 
  select(-var_label) %>% 
  gt() %>% 
  tab_header(
    title = "Table 1: Initial Item Set"
  ) %>% 
  cols_label(
    item = "Item"
  )
Table 1: Initial Item Set
Item
People who inject drugs should be able to do so in a way that prevents them from causing further harm to their health.
People should have access to tools for safer sex (like condoms, STD tests).
People who use drugs should have access to naloxone/NARCAN.
The general public should have access to naloxone/NARCAN.
Police officers should have access to naloxone/NARCAN.
People who use drugs should have access to tools to test what's in their drugs.
People who use drugs should have access to safe injection supplies (sterile needles and syringes).
People who use drugs should have access to safe inhalation supplies (glass stems and pipes).
People who actively use drugs should have access to therapy/counseling.
People who return to using drugs after a period of abstinence should be allowed to continue in treatment.
Sobriety should be required for treatment.
Medications used to treat addiction (buprenorphine, naltrexone, or methadone) are an appropriate treatment option for people who use drugs.
Possession of "drug paraphernalia", like syringes and pipes, should be legal.
People who seek medical assistance for overdoses should be protected from drug charges, arrests, and prosecutions.
Sobriety should not be a requirement to access public housing.
Possession of all drugs should be decriminalized (possession would not lead to legal repercussions).
It should be legal for adults to purchase drugs from a dispensary/shop.
People who use drugs should have access to supervised places where they can consume drugs safely.
People who use drugs should have access to a legal, non-contaminated drug supply.
People use drugs to escape.
People should be able to use drugs safely.
People who use drugs should be treated with respect.
Poverty affects the health of people who use drugs.
Racism affects the health of people who use drugs.
Gender-based discrimination affects the health of people who use drugs.
Some ways of using drugs are safer than others.
People who use drugs deserve to live good lives.
Reducing drug use is a reasonable goal for people who use drugs.
Some people who use drugs cannot be expected to quit immediately.
People who use drugs should be involved in creating the programs and policies that serve them.
People in recovery from drug use should be involved in creating the programs and policies that serve them.
Relapse may be a part of the recovery process.
It is possible to live a healthy life without stopping drug use.
People who use drugs should be forced into treatment.
Using drugs is immoral.
Drug use has benefits.
Harm reduction complements traditional addiction prevention, treatment, and recovery services.
People who use drugs benefit society.
Reducing the negative consequences of drug use encourages more people to use drugs.
People will use more drugs if it is safer.
People who use drugs will naturally end up homeless.
Drugs make the world worse.
Drug use will always be part of society.
Chaotic drug use is a rational response to experiences like trauma, homelessness, hunger, and poverty.
People who use drugs should be able to use medications used to treat addiction (buprenorphine, naltrexone, or methadone) for any length of time.

Study 1

Study 1 was preregistered; the preregistration can be accessed at https://osf.io/yt7sm. In this study, we conducted an exploratory factor analysis on a polychoric correlation matrix, using maximum likelihood estimation for factor extraction and an oblimin rotation. We used the 40-30-20 rule to determine factor loading cutoffs: retained items should load on their primary factor above .40, load on alternative factors below .30, and show a difference of at least .20 between primary and alternative factor loadings. Finally, scale reliability was assessed using Cronbach's alpha, the greatest lower bound, and average split-half reliability.
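The 40-30-20 screen reduces to a simple check on each row of the pattern matrix. A minimal base-R sketch; the function name and the example loadings are illustrative, not study values:

```r
# 40-30-20 retention rule for a single item's factor loadings:
# primary loading >= .40, every cross-loading < .30, and at least a
# .20 gap between the primary and the strongest cross-loading.
passes_40_30_20 <- function(loadings) {
  l <- sort(abs(loadings), decreasing = TRUE)
  primary <- l[1]
  secondary <- if (length(l) > 1) l[2] else 0
  primary >= .40 && secondary < .30 && (primary - secondary) >= .20
}

# Hypothetical pattern-matrix rows for illustration only
passes_40_30_20(c(0.62, 0.11, -0.05))  # TRUE: clean primary loading
passes_40_30_20(c(0.45, 0.33, 0.02))   # FALSE: cross-loading too high
```

In practice the rule is applied to each row of the rotated pattern matrix returned by the EFA, with items failing the check flagged for removal.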

Participants

Code
# Libraries
library(psych)
library(correlation)
library(EFA.dimensions)
library(GPArotation)

# Import Data
df <- read.csv("data/clean/20231011_hr-scale-exploratory-data.csv")

# Build Demographic Data Set
demos <- df %>% 
  select(age, gid1, race, ed, sud_hx)

# Build Data Set of Items
hr <- df %>% # N = 301
  select(starts_with("q"), -Q_RecaptchaScore)

Through Prolific, we recruited 301 participants for Study 1. These participants were 50% men (n = 149) and 70% White (n = 210); 41% had completed a bachelor's degree (n = 122), 88% reported no history of a substance use disorder (SUD) diagnosis, and the average age was 36. Table 2 provides additional detail about this sample.

Code
library(gtsummary)

demos %>% 
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2,2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age", gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity", ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis")
              ) %>% 
  as_gt() %>% 
  tab_header(title = "Table 2: Study 1 Participants")
Table 2: Study 1 Participants
Characteristic N = 301¹
Age 35.92 (11.28)
Gender Identity
    Man 149 (50%)
    Woman 143 (48%)
    Non-Binary or Gender Fluid 7 (2.3%)
    Prefer not to say 1 (0.3%)
Race/Ethnicity
    White 210 (70%)
    Black or African American 30 (10.0%)
    Asian or Asian American 28 (9.3%)
    Multiracial 13 (4.3%)
    Latine or Hispanic 12 (4.0%)
    Prefer not to say 4 (1.3%)
    Indigenous, Aboriginal, or First Nations 2 (0.7%)
    Prefer to self-describe 2 (0.7%)
Highest Level of Education
    Bachelor’s Degree 122 (41%)
    High School 97 (32%)
    Master’s Degree 39 (13%)
    Associate’s Degree 28 (9.3%)
    Doctorate Degree (PhD, PsyD, etc.) 4 (1.3%)
    Professional Degree (J.D., M.D., etc.) 4 (1.3%)
    Less than High School 3 (1.0%)
    Other option not represented here 2 (0.7%)
    Prefer not to say 1 (0.3%)
History of Substance Use Disorder Diagnosis
    No 265 (88%)
    Yes 28 (9.3%)
    Prefer not to say 5 (1.7%)
    I don’t know 3 (1.0%)
¹ Mean (SD); n (%)

Initial Data Inspection

Code
bart <- cortest.bartlett(hr)

kmo <- KMO(hr)

Using a polychoric correlation matrix, we evaluated Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure to ensure that the data were appropriate for factor analysis. The data appeared adequate, with KMO = 0.94 and Bartlett's sphericity \(\chi^2\)(990) = 8275.10, p < .001. Inspection of the correlation matrix showed many inter-item correlations > .40, indicating that a factor solution should be possible.
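The KMO statistic compares squared correlations to squared anti-image (partial) correlations, and can be computed from first principles on any correlation matrix. A base-R sketch on simulated data, not the study items; `kmo_overall` is an illustrative helper, not part of the psych API:

```r
# Overall KMO sampling adequacy from a correlation matrix R:
# ratio of summed squared correlations to summed squared correlations
# plus summed squared anti-image partial correlations.
kmo_overall <- function(R) {
  Q <- solve(R)                            # inverse correlation matrix
  P <- -Q / sqrt(outer(diag(Q), diag(Q)))  # anti-image partial correlations
  diag(P) <- 0                             # keep off-diagonal terms only
  diag(R) <- 0
  sum(R^2) / (sum(R^2) + sum(P^2))
}

# Simulated data: 3 indicators sharing one latent factor, n = 500
set.seed(1)
f <- rnorm(500)
X <- sapply(1:3, function(i) f + rnorm(500))
kmo <- kmo_overall(cor(X))  # well above chance-level adequacy
```

psych::KMO() (used above) additionally reports per-item adequacy values from the same anti-image matrix.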

Local Dependence Between Items

Code
# Saved because LOCALDEP takes forever to run
# ld <- LOCALDEP(hr)
# saveRDS(ld, "study1-local-dependence.rds")

ld <- readRDS("study1-local-dependence.rds")

Initial inspection revealed some excessively high inter-item correlations. There were 14 inter-item correlations > .70 and 2 partial correlations > .60. Using the EFA.dimensions package, we inspected possible local dependence via the Q3, X2, and G2 statistics. Possibly locally dependent items were inspected pair by pair: those that were highly similar were removed, while those that appeared statistically locally dependent but differed in content were retained. Table 3 provides additional detail about possible local dependence. Based on both quantitative and qualitative inspection, the following items were removed due to their similarity with others: items 4 and 5 (local dependence with item 3), item 8 (with item 7), item 45 (with item 12), item 22 (with item 27), items 24 and 25 (with item 23), item 32 (with item 29), item 31 (with item 30), and item 40 (with item 39). While other removals were possible, these items represented the greatest overlap in both statistical and content terms.
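Yen's Q3 correlates residuals after an item response model has accounted for the common dimension; redundant item pairs stand out because they share variance beyond that dimension. As a rough classical analogue, one can residualize each item on the rest-score and correlate the residuals. The sketch below uses simulated data and an illustrative helper (`q3_matrix`); it is not the EFA.dimensions::LOCALDEP computation used above:

```r
# Simplified Q3-style screen: residualize each item on the sum of the
# remaining items, then correlate the residuals. Elevated residual
# correlations suggest local dependence between an item pair.
q3_matrix <- function(X) {
  resid_mat <- sapply(seq_len(ncol(X)), function(i) {
    rest <- rowSums(X[, -i, drop = FALSE])
    residuals(lm(X[, i] ~ rest))
  })
  cor(resid_mat)
}

# Simulated data: 4 items from one factor, plus a near-duplicate of item 4
set.seed(2)
f <- rnorm(400)
X <- sapply(1:4, function(i) f + rnorm(400))
X <- cbind(X, X[, 4] + rnorm(400, sd = 0.3))
q3 <- q3_matrix(X)
q3[4, 5]  # elevated residual correlation flags the redundant pair
```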

Code
ld$localdep_stats %>% 
  # Remove additional statistic
  select(-JSI) %>% 
  # Filter out unconcerning items
  filter(
    abs(Q3) > .34 &
     ( p_X2 < .05 |
      p_G2 < .05)
  ) %>% 
  gt(groupname_col = "Item_A", 
     rowname_col = "Item_B") %>% 
  tab_spanner(label = "Correlations", columns = c(3,4)) %>% 
  tab_spanner(label = "X2 Statistics", columns = c(6,7)) %>% 
  tab_spanner(label = "G2 Statistics", columns = c(8,9)) %>% 
  tab_header(title = "Table 3: Inspection of Local Dependence between Similar Items")
Table 3: Inspection of Local Dependence between Similar Items
Correlations Q3 X2 Statistics G2 Statistics
r partial_r X2 p_X2 G2 p_G2
q3
q4 0.829 0.705 0.688 117.534 0.000 131.412 0.000
q5 0.664 0.531 0.517 84.571 0.000 94.799 0.000
q4
q5 0.692 0.576 0.570 96.480 0.000 109.867 0.000
q7
q8 0.852 0.546 0.463 54.659 0.001 70.158 0.000
q12
q45 0.576 0.392 0.391 51.460 0.001 56.087 0.000
q22
q27 0.658 0.409 0.382 49.280 0.003 55.253 0.000
q23
q24 0.497 0.417 0.421 83.618 0.000 79.173 0.000
q24
q25 0.706 0.632 0.631 160.541 0.000 149.798 0.000
q29
q32 0.505 0.365 0.367 72.966 0.000 50.120 0.002
q30
q31 0.732 0.582 0.575 148.996 0.000 152.797 0.000
q33
q36 0.631 0.488 0.489 101.703 0.000 104.794 0.000
q38 0.614 0.440 0.443 88.980 0.000 96.442 0.000
q42 -0.582 -0.417 -0.424 68.425 0.000 71.765 0.000
q35
q41 0.615 0.362 0.347 55.910 0.000 62.287 0.000
q42 0.582 0.338 0.347 43.785 0.011 53.620 0.001
q36
q38 0.628 0.453 0.456 64.551 0.000 75.351 0.000
q42 -0.648 -0.504 -0.510 88.178 0.000 91.166 0.000
q38
q42 -0.660 -0.497 -0.501 87.129 0.000 96.960 0.000
q39
q40 0.694 0.587 0.583 104.985 0.000 113.625 0.000
Code
hr1 <- hr %>% select(-c(q4, q5, q8, q22, q24, q25, q32, q31, q40, q45))

bart <- cortest.bartlett(hr1)

kmo <- KMO(hr1)

After removing items that showed excessive local dependence, Bartlett's sphericity and KMO were checked again: KMO = 0.95, Bartlett's sphericity \(\chi^2\)(595) = 5557.32, p < .001.

Exploratory Factor Analysis

Parallel Analysis

Code
fa.para <- fa.parallel(hr1,
                       fm = "ml", 
                       cor = "poly")

Parallel analysis suggests that the number of factors =  4  and the number of components =  2 
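Horn's parallel analysis retains factors whose observed eigenvalues exceed those expected from random data of the same dimensions. psych::fa.parallel (used above) adds resampling detail and polychoric support; the base-R sketch below shows only the core comparison, on simulated data with an illustrative helper (`parallel_cut`):

```r
# Minimal parallel analysis: compare observed eigenvalues against the
# mean eigenvalues of uncorrelated random data of the same size, and
# count how many observed eigenvalues exceed their random counterparts.
parallel_cut <- function(X, n_sims = 50) {
  obs <- eigen(cor(X), symmetric = TRUE)$values
  sims <- replicate(n_sims, {
    R <- matrix(rnorm(nrow(X) * ncol(X)), nrow(X))
    eigen(cor(R), symmetric = TRUE)$values
  })
  sum(obs > rowMeans(sims))
}

# Simulated data: 8 items generated from two independent factors
set.seed(3)
f1 <- rnorm(300); f2 <- rnorm(300)
X <- cbind(sapply(1:4, function(i) f1 + rnorm(300)),
           sapply(1:4, function(i) f2 + rnorm(300)))
n_factors <- parallel_cut(X)  # recovers the two simulated factors
```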

Model 1

Code
# Initial Factor Analysis
efa <- EFA.dimensions::EFA(hr1,
                    extraction = "ml", 
                    Nfactors = 4,
                    corkind = "polychoric",
                    rotation = "oblimin",
                    verbose = FALSE)



initial <- efa$pattern %>% 
  as_tibble(rownames = "q") %>% 
  left_join(items,
            by = c("q" = "var_label")) %>% 
  select(q, item, 2:5) %>% 
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = `Factor 1`,
      rows = abs(`Factor 1`) < .30 ),
      # Factor 2
      cells_body(
        columns = `Factor 2`,
        rows = abs(`Factor 2`) < .30 
      ),
      # Factor 3
       cells_body(
        columns = `Factor 3`,
        rows = abs(`Factor 3`) < .30 
      ),
      # Factor 4
       cells_body(
        columns = `Factor 4`,
        rows = abs(`Factor 4`) < .30 
      )),
    style = list(cell_text(color = 'gray'))) %>% 
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3) 

An initial factor analysis revealed that a four-factor solution provided poor fit, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.80), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = 0.90). This model had the following fit indices: \(\chi^2\)(461) = 1007.56, p < .001; CFI = 0.911; TLI = 0.885; RMSEA = 0.066. In addition to poor fit, there were numerous cross-loadings that exceeded the .20 difference limit. Table 4 provides full factor loadings.

Code
# New Matrix
hr2 <- hr1 %>% 
  select(
    -c(q17, q12, q43, q15, q35, q37, q26, q34, q44, q20, q11, q28, q14, q27, q30, q3, q13))


efa.final <- EFA.dimensions::EFA(hr2,
                    extraction = "ml", 
                    Nfactors = 3, 
                    Ncases = 301,
                    rotation = "oblimin",
                    verbose = FALSE)

final <- efa.final$pattern %>% 
  as_tibble(rownames = "q") %>% 
  left_join(items,
            by = c("q" = "var_label")) %>% 
  select(q, item, 2:5) %>%  
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = `Factor 1`,
      rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(
        columns = `Factor 2`,
        rows = abs(`Factor 2`) < .30 
      ),
      # Factor 3
       cells_body(
        columns = `Factor 3`,
        rows = abs(`Factor 3`) < .30 
      )),
    style = list(cell_text(color = 'gray'))) %>% 
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3) 


left_join(
    initial$`_data`,
    final$`_data`,
    by = c(
      "q" = "q"
    )) %>% 
  arrange(
    factor(q,
           levels = 
             c(
               # Factor 1
               "q1", "q18", "q7", "q19", "q21", "q6", "q16", "q39",
               # Factor 2
               "q42", "q36", "q38", "q33", "q41",
               # Factor 3
               "q9", "q2", "q23", "q10", "q29"))) %>% 
  janitor::clean_names() %>% 
  select(-q, -item_y) %>% 
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = factor_1_x,
      rows = abs(factor_1_x) < .30),
      # Factor 2
      cells_body(
        columns = factor_2_x,
        rows = abs(factor_2_x) < .30 
      ),
      # Factor 3
       cells_body(
        columns = factor_3_x,
        rows = abs(factor_3_x) < .30 
      ),
      # Factor 4
       cells_body(
        columns = factor_4,
        rows = abs(factor_4) < .30 
      ),
       # Factor 1 - Final
      cells_body(
      columns = factor_1_y,
      rows = abs(factor_1_y) < .30),
      # Factor 2 - Final
      cells_body(
        columns = factor_2_y,
        rows = abs(factor_2_y) < .30 
      ),
      # Factor 3 - Final
       cells_body(
        columns = factor_3_y,
        rows = abs(factor_3_y) < .30 
      )),
    style = list(cell_text(color = 'gray'))) %>% 
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3)  %>% 
  fmt_missing() %>% 
  tab_header(
    title = "Table 4: Initial and Final Factor Structure"
  ) %>% 
  tab_spanner(
    label = "Initial 4 Factor Solution",
    columns = 2:5
  ) %>% 
  tab_spanner(
    label = "Final 3 Factor Solution",
    columns = 6:8
  ) %>% 
  cols_label(
    item_x = "Item", 
    factor_1_x  = "Factor 1",
    factor_2_x = "Factor 2",
    factor_3_x = "Factor 3",
    factor_4 = "Factor 4",
    factor_1_y = "Factor 1",
    factor_2_y = "Factor 2",
    factor_3_y = "Factor 3"
  )

Model 2

Based on the initial factor loadings, items were iteratively removed and the structure was reduced from four factors to three. In total, 17 items were removed (see Table 4), resulting in an 18-item, 3-factor structure with adequate factor loadings and cross-loadings. This model had the following fit indices: \(\chi^2\)(102) = 139.19, p = 0.009; CFI = 0.985; TLI = 0.978; RMSEA = 0.036. In this solution, factor 1 explained 47% of the total variance (eigenvalue = 10.45), factor 2 explained 12% (eigenvalue = 2.74), and factor 3 explained 5% (eigenvalue = 1.09). This model showed acceptable fit and was chosen as the final solution for this study.

Scale Inspection

The first factor shows good reliability, with alpha, average split-half reliability, and the greatest lower bound all at or above .91. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low. Finally, the third factor has split-half reliability, internal consistency, and greatest lower bound between .66 and .73, indicating acceptable, if low, reliability. Based on these statistics, the first factor provides the best measurement of its construct, while factors 2 and 3 present some concern.

Code
scale1 <- df %>% 
  select(q1, q18, q7, q19, q21, q6, q16, q39)
scale2 <- df %>% 
  select(q42, q36, q38, q33, q41)
scale3 <- df %>% 
  select(q9, q2, q23, q10, q29)


splithalf1 <- splitHalf(scale1) ; splithalf2 <- splitHalf(scale2) ; splithalf3 <- splitHalf(scale3) 
alpha1 <- psych::alpha(scale1, check.keys = TRUE) ; alpha2 <- psych::alpha(scale2, check.keys = TRUE) ; alpha3 <- psych::alpha(scale3, check.keys = TRUE) 
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs")) ; glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs")) ; glb3 <- glb.algebraic(cor(scale3, use = "pairwise.complete.obs")) 

data.frame(Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2", "Factor 3", "Factor 3", "Factor 3"),
           `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                                       "Average Split Half", "Alpha", "Greatest Lower Bound", 
                                       "Average Split Half", "Alpha", "Greatest Lower Bound"),
           Statistic = c(round(splithalf1$meanr,2),
                         round(alpha1$total$raw_alpha, 2), 
                         round(glb1$glb, 2),
                         round(splithalf2$meanr,2),
                         round(alpha2$total$raw_alpha, 2), 
                         round(glb2$glb, 2),
                         round(splithalf3$meanr,2),
                         round(alpha3$total$raw_alpha, 2), 
                         round(glb3$glb, 2))) %>% 
  gt::gt(groupname_col = "Factor") %>% 
  gt::tab_header(title = "Table 5: Reliability Statistics For All Scales") 
Table 5: Reliability Statistics For All Scales
Reliability.Statistic Statistic
Factor 1
Average Split Half 0.92
Alpha 0.92
Greatest Lower Bound 0.91
Factor 2
Average Split Half 0.83
Alpha 0.85
Greatest Lower Bound 0.41
Factor 3
Average Split Half 0.66
Alpha 0.68
Greatest Lower Bound 0.73

Once reliability for the final factor solution had been inspected, item responses were investigated. For factor 1, the distribution of responses showed that the lowest response options (1, "Strongly Disagree" and 2, "Disagree") were underused for many items, except for items that loaded negatively (i.e., reverse coded), such as the final item in factor 1, for which the highest response option (6, "Strongly Agree") was used only 7% of the time.

Code
item <- read.csv("references/docs/items.csv")


items.1 <-  
  data.frame(alpha1$response.freq) %>% 
  mutate(item = rownames(.)) %>% 
  left_join(item %>% 
              select(qname, question), 
            by = c("item" = "qname")) %>% 
  pivot_longer(c(X1:X6)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly Disagree",
             name == "X2" ~ "Disagree",
             name == "X3" ~ "Somewhat Disagree",
             name == "X4" ~ "Somewhat Agree",
             name == "X5" ~ "Agree",
             name == "X6" ~ "Strongly Agree"
           ),
           levels = c("Strongly Agree","Agree","Somewhat Agree",
                      "Somewhat Disagree","Disagree",
                      "Strongly Disagree"))) %>% 
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
    geom_text(aes(label = paste0(round(value*100,0),"%")), 
              position = position_stack(vjust = .5),
              size = 8, color = "white") +
  ggokabeito::scale_fill_okabe_ito(
      breaks = c("Strongly Disagree","Disagree","Somewhat Disagree",
                 "Somewhat Agree","Agree","Strongly Agree"
      )) +
    coord_flip() +
    ggforce::facet_col(facets = vars(question), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Factor 1") +
    theme(
      axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 15, vjust = -.2),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 30),
      legend.text = element_text(size = 10)
    )

## Scale 2
items.2 <- data.frame(alpha2$response.freq) %>% 
    mutate(item = rownames(.)) %>% 
  left_join(item %>% 
              select(qname, question), 
            by = c("item" = "qname")) %>% 
  pivot_longer(c(X1:X6)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly Disagree",
             name == "X2" ~ "Disagree",
             name == "X3" ~ "Somewhat Disagree",
             name == "X4" ~ "Somewhat Agree",
             name == "X5" ~ "Agree",
             name == "X6" ~ "Strongly Agree"
           ),
           levels = c("Strongly Agree","Agree","Somewhat Agree",
                      "Somewhat Disagree","Disagree",
                      "Strongly Disagree"))) %>% 
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
    geom_text(aes(label = paste0(round(value*100,0),"%")), 
              position = position_stack(vjust = .5),
              size = 8, color = "white") +
    ggokabeito::scale_fill_okabe_ito(
      breaks = c("Strongly Disagree","Disagree","Somewhat Disagree",
                 "Somewhat Agree","Agree","Strongly Agree"
      )) +
    coord_flip() +
    ggforce::facet_col(facets = vars(question), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Factor 2") +
    theme(
     axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 10),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )



## Scale 3
items.3 <- data.frame(alpha3$response.freq) %>% 
mutate(item = rownames(.)) %>% 
  left_join(item %>% 
              select(qname, question), 
            by = c("item" = "qname")) %>% 
  pivot_longer(c(X1:X6)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly Disagree",
             name == "X2" ~ "Disagree",
             name == "X3" ~ "Somewhat Disagree",
             name == "X4" ~ "Somewhat Agree",
             name == "X5" ~ "Agree",
             name == "X6" ~ "Strongly Agree"
           ),
           levels = c("Strongly Agree","Agree","Somewhat Agree",
                      "Somewhat Disagree","Disagree",
                      "Strongly Disagree"
                 ))) %>% 
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
    geom_text(aes(label = paste0(round(value*100,0),"%")), 
              position = position_stack(vjust = .52),
              size = 3, color = "white") +
    ggokabeito::scale_fill_okabe_ito(
      breaks = c("Strongly Disagree","Disagree","Somewhat Disagree",
                 "Somewhat Agree","Agree","Strongly Agree"
      )) +
    coord_flip() +
    ggforce::facet_col(facets = vars(question), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Factor 3") +
    theme(
      axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 10),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )

items.1

Figure 1: Study 1, Factor 1 Response Distribution

The same was true for both Factors 2 and 3, which showed that extreme response options were underutilized across most items (See Figures 2 & 3).

Code
items.2
items.3

Figure 2: Study 1, Factor 2 Response Distribution

Figure 3: Study 1, Factor 3 Response Distribution

Because of this underuse of extreme response options, we reduced the response scale from 6 options to 5 for future data collection. The final factor solution was deemed acceptable for confirmatory studies.

Study 2

Study 2 was preregistered; the preregistration can be accessed at https://osf.io/t5e2a. In this study, we conducted a confirmatory factor analysis on new data using the factor structure from Study 1. The model was fit using weighted least squares estimation, treating response categories as ordinal rather than numeric, which was a deviation from the preregistration. We used Hu and Bentler's (1999) cutoffs of .95 for TLI and CFI and .06 for RMSEA.
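The preregistered cutoffs can be captured in a small helper for screening fit indices; a sketch, where `meets_cutoffs` is an illustrative function (the indices themselves would come from lavaan::fitmeasures(), as in the code below):

```r
# Hu & Bentler (1999) screening rule: CFI and TLI at or above .95,
# RMSEA at or below .06. Returns a named logical vector per index.
meets_cutoffs <- function(cfi, tli, rmsea) {
  c(cfi = cfi >= .95, tli = tli >= .95, rmsea = rmsea <= .06)
}

# Illustrative values: incremental indices pass, RMSEA fails
res <- meets_cutoffs(cfi = .97, tli = .97, rmsea = .08)
```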

Participants

Through Prolific, we recruited 402 participants for Study 2. These participants were 49% women (n = 196) and 71% White (n = 282); 37% had completed a bachelor's degree (n = 150), and 90% (n = 362) reported no history of an SUD diagnosis. Table 6 provides detail about the participants.

Code
df <- read.csv("data/clean/20231020_hr-scale-confirmatory-data.csv")

demos <- df %>% 
  select(age, gid1, race, ed, sud_hx)

demos %>% 
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2,2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age", gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity", ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis")
              ) %>% 
  as_gt() %>% 
  tab_header(title = "Table 6: Study 2 Participants")
Table 6: Study 2 Participants
Characteristic N = 409¹
Age 36.45 (12.35)
Gender Identity
    Woman 196 (49%)
    Man 191 (48%)
    Non-Binary, Agender, or Other 13 (3.3%)
Race/Ethnicity
    White 282 (71%)
    Asian or Asian American 41 (10%)
    Black or African American 33 (8.3%)
    Latine or Hispanic 18 (4.5%)
    Multiracial 17 (4.3%)
    Other or Prefer not to say 8 (2.0%)
Highest Level of Education
    Bachelor’s Degree 150 (37%)
    High School 122 (30%)
    Master’s Degree 50 (12%)
    Associate’s Degree 44 (11%)
    Other option not represented here 11 (2.7%)
    Professional Degree (J.D., M.D., etc.) 11 (2.7%)
    Doctorate Degree (PhD, PsyD, etc.) 5 (1.2%)
    Less than High School 4 (1.0%)
    Prefer not to say 4 (1.0%)
History of Substance Use Disorder Diagnosis
    No 362 (90%)
    Yes 27 (6.7%)
    Prefer not to say 8 (2.0%)
    I don’t know 5 (1.2%)
¹ Mean (SD); n (%)

Confirmatory Factor Analysis

The model from Study 1 was fit using confirmatory factor analysis with a weighted least squares estimator and ordinal indicators. This model showed mixed fit on the selected statistics: \(\chi^2\)(132) = 453.24, p < 0.001; CFI = .97; TLI = .97; SRMR = .17; RMSEA = .08. Based on CFI and TLI, the model fit well; based on SRMR and RMSEA, it fit poorly. Figure 4 provides factor loadings and covariances for the model.

Code
library(lavaan)
library(semPlot)
library(semptools)

hr <- df %>% 
  select(starts_with("q"), -Q_RecaptchaScore)

# Initial Model Fit
model <-  'F1 =~ q1 + q18 + q7 + q19 + q21 + q6 + q16 + q39
           F2 =~ q42 + q36 + q38 + q33 + q41
           F3 =~ q9 + q2 + q23 + q10 + q29

           F1 ~~ F2
           F1 ~~ F3' 

fit <- cfa(
  model,
  data = df,
  ordered = TRUE,
  estimator = "WLS",
  std.lv = TRUE
)      


fitmeasures(fit,
            c("cfi", "tli", "srmr", "rmsea", "chisq", "df", "pvalue"),
            ci = TRUE) %>% 
  as_tibble(rownames = "Statistic") %>% 
  gt() %>% 
  tab_header(
    title = "Table 7: CFA Fit Statistics"
  ) %>% 
  fmt_number(columns = 2, 
             decimals = 2)
Table 7: CFA Fit Statistics
Statistic value
cfi 0.97
tli 0.97
srmr 0.17
rmsea 0.08
chisq 453.24
df 132.00
pvalue 0.00
Code
plot(set_cfa_layout(
  semPaths(fit, whatLabels="est",
        sizeMan = 3,
        node.width = 1,
        thresholds = FALSE,
        edge.label.cex = .75,
        style = "ram",
        mar = c(5, 1, 5, 1),
        intercepts = FALSE,
        DoNotPlot = TRUE),
  fcov_curve = 1.75,
  loading_position = .9))

Figure 4: Study 2 CFA

Scale Reliability

Reliability remains an issue in this model: the first factor continues to show adequate reliability, but factors 2 and 3 continue to show poorer reliability. Additionally, reliability could only be improved by dropping items, which would shorten factor 2 to 3 items, if not fewer.

Code
scale1 <- df %>% 
  select(q1, q18, q7, q19, q21, q6, q16, q39)
scale2 <- df %>% 
  select(q42, q36, q38, q33, q41)
scale3 <- df %>% 
  select(q9, q2, q23, q10, q29)


splithalf1 <- splitHalf(scale1) ; splithalf2 <- splitHalf(scale2) ; splithalf3 <- splitHalf(scale3) 
alpha1 <- psych::alpha(scale1, check.keys = TRUE) ; alpha2 <- psych::alpha(scale2, check.keys = TRUE) ; alpha3 <- psych::alpha(scale3, check.keys = TRUE) 
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs")) ; glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs")) ; glb3 <- glb.algebraic(cor(scale3, use = "pairwise.complete.obs")) 

data.frame(Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2", "Factor 3", "Factor 3", "Factor 3"),
           `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                                       "Average Split Half", "Alpha", "Greatest Lower Bound", 
                                       "Average Split Half", "Alpha", "Greatest Lower Bound"),
           Statistic = c(round(splithalf1$meanr,2),
                         round(alpha1$total$raw_alpha, 2), 
                         round(glb1$glb, 2),
                         round(splithalf2$meanr,2),
                         round(alpha2$total$raw_alpha, 2), 
                         round(glb2$glb, 2),
                         round(splithalf3$meanr,2),
                         round(alpha3$total$raw_alpha, 2), 
                         round(glb3$glb, 2))) %>% 
  gt::gt(groupname_col = "Factor") %>% 
  gt::tab_header(title = "Table 5: Reliability Statistics For All Three Factors")
Table 5: Reliability Statistics For All Three Factors
Reliability.Statistic Statistic
Factor 1
Average Split Half 0.91
Alpha 0.91
Greatest Lower Bound 0.91
Factor 2
Average Split Half 0.79
Alpha 0.81
Greatest Lower Bound 0.38
Factor 3
Average Split Half 0.68
Alpha 0.67
Greatest Lower Bound 0.73

Exploratory Factor Analysis

Code
hr <- df %>% # N = 301
  select(starts_with("q"), -Q_RecaptchaScore)

hr1 <-
  hr %>% 
  select(-c(q4:q5, q25, q31, q40, q45, q8))


bart <- cortest.bartlett(hr1)

kmo <- KMO(hr1)

Based on the poor model fit, the same data set was used to conduct a second exploratory factor analysis using a maximum likelihood estimator and an oblimin rotation. This exploratory analysis was not preregistered. The items which showed initial local dependence were reinspected; in this study, items 8, 24, 22, and 32 showed acceptable levels of local dependence and were retained. After removing items which showed excessive local dependence, Bartlett's test of sphericity and the Kaiser-Meyer-Olkin measure were checked again: KMO = .95 and Bartlett's \(\chi^2\)(703) = 7757.05, p < .001.
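For reference, Bartlett's statistic can be computed directly from a correlation matrix; the sketch below implements the standard formula, \(\chi^2 = -(n - 1 - (2p + 5)/6)\ln|R|\) with df = p(p − 1)/2 (the helper name `bartlett_sphericity` is illustrative, and `psych::cortest.bartlett` is what the analysis itself uses):

```r
# Bartlett's test of sphericity from a correlation matrix R
# computed on n observations of p variables.
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# An identity matrix (no shared variance) gives chisq = 0, p = 1,
# i.e., no evidence the data are factorable.
bartlett_sphericity(diag(3), n = 100)
```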

Code
# Initial Factor Analysis
efa <- EFA.dimensions::EFA(hr1,
                    extraction = "ml", 
                    Nfactors = 4, 
                    corkind = "polychoric",
                    rotation = "oblimin",
                    verbose = FALSE)



initial <- efa$pattern %>% 
  as_tibble(rownames = "q") %>% 
  left_join(items,
            by = c("q" = "var_label")) %>% 
  select(q, item, 2:5) %>%  
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = `Factor 1`,
      rows = abs(`Factor 1`) < .30 ),
      # Factor 2
      cells_body(
        columns = `Factor 2`,
        rows = abs(`Factor 2`) < .30 
      ),
      # Factor 3
       cells_body(
        columns = `Factor 3`,
        rows = abs(`Factor 3`) < .30 
      ),
      # Factor 4
       cells_body(
        columns = `Factor 4`,
        rows = abs(`Factor 4`) < .30 
      )),
    style = list(cell_text(color = 'gray'))) %>% 
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3) 

Initial EFA Model

An initial factor analysis revealed that a four-factor solution provided poor fit for the data, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.8), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = .90). This model had the following fit indices: \(\chi^2\)(557) = 1736.37, p < .001; CFI = 0.873; TLI = 0.839; RMSEA = 0.076. In addition to poor fit, numerous items had cross-loadings within .2 of their primary loading. Table 6 provides full factor loadings.

After removing items with no primary loading or with close cross-loadings, a three-factor solution provided the best fit. At that point, factor 3 consisted of 6 items which explained only 22% of the total variance, and 4 of its factor loadings were less than .55. Based on these quantitative concerns and an inspection of the qualitative content of the items, items 2, 9, 28, 29, and 32 were removed. The content of these items focused primarily on treatment aspects of harm reduction, which were less directly relevant to the constructs of interest.
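The cross-loading screen described above can be expressed as a small helper that flags any item whose two largest absolute loadings fall within .2 of each other. The function name `flag_crossloadings` and the toy loading matrix below are illustrative, not the project's actual code:

```r
# Flag items whose primary and secondary absolute loadings differ
# by less than `gap`; `loadings` is an items x factors matrix.
flag_crossloadings <- function(loadings, gap = .2) {
  apply(abs(loadings), 1, function(l) {
    srt <- sort(l, decreasing = TRUE)
    srt[1] - srt[2] < gap
  })
}

L <- rbind(clean   = c(.80, .10, .05),   # clear primary loading
           crossed = c(.45, .40, .10))   # .45 - .40 < .2, flagged
flag_crossloadings(L)
```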

Code
hr2 <- df %>% 
  select(# Factor 1
        q7, q1, q6, q18, q19, q21, q24, q14, q13,
        # Factor 2
        q33, q42, q36, q35, q38, q34, q41)

efa.final <- EFA.dimensions::EFA(hr2,
                    extraction = "ml", 
                    Nfactors = 2, 
                    corkind = "polychoric",
                    rotation = "oblimin",
                    verbose = FALSE)

final <- efa.final$pattern %>% 
  as_tibble(rownames = "q") %>% 
  left_join(items,
            by = c("q" = "var_label")) %>% 
  select(q, item, 2:3) %>%  
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = `Factor 1`,
      rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(
        columns = `Factor 2`,
        rows = abs(`Factor 2`) < .30 
      )),
    style = list(cell_text(color = 'gray'))) %>% 
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3) 

Final EFA Model

The resulting model was a two-factor model consisting of 16 items. Factor 1 accounted for 59% of the explained variance and Factor 2 for 41%. This model had the following fit indices: \(\chi^2\)(89) = 305.23, p < .001; CFI = 0.952; TLI = 0.936; RMSEA = 0.079, which indicated acceptable fit. Due to its low loading in this study, item "q39" is a candidate for removal following the next round of data collection.

Code
left_join(
    initial$`_data`,
    final$`_data`,
    by = c(
      "q" = "q"
    )) %>% 
  arrange(
    factor(q,
           levels = 
             c(
               # Factor 1
              "q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",
               # Factor 2 
               "q33", "q42", "q36", "q35", "q38", "q34", "q41"))) %>% 
  janitor::clean_names() %>% 
  select(-q, -item_y) %>% 
  gt() %>% 
  tab_style(
    locations = list(
      # Factor 1
      cells_body(
      columns = factor_1_x,
      rows = abs(factor_1_x) < .30),
      # Factor 2
      cells_body(
        columns = factor_2_x,
        rows = abs(factor_2_x) < .30
      ),
      # Factor 3
       cells_body(
        columns = factor_3,
        rows = abs(factor_3) < .30
      ),
      # Factor 4
       cells_body(
        columns = factor_4,
        rows = abs(factor_4) < .30
      ),
       # Factor 1 - Final
      cells_body(
      columns = factor_1_y,
      rows = abs(factor_1_y) < .30),
      # Factor 2 - Final
      cells_body(
        columns = factor_2_y,
        rows = abs(factor_2_y) < .30
      )),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(
    columns = starts_with("Factor"),
    decimals = 3)  %>%
  fmt_missing() %>%
  tab_header(
    title = "Table 8: Initial and Final Factor Structure for 2 factors"
  ) %>%
  tab_spanner(
    label = "Initial 4 Factor Solution",
    columns = 2:5
  ) %>%
  tab_spanner(
    label = "Final 2 Factor Solution",
    columns = 6:7
  ) %>%
  cols_label(
    item_x = "Item",
    factor_1_x  = "Factor 1",
    factor_2_x = "Factor 2",
    factor_3 = "Factor 3",
    factor_4 = "Factor 4",
    factor_1_y = "Factor 1",
    factor_2_y = "Factor 2"
  )
Table 8: Initial and Final Factor Structure for 2 factors
Item Initial 4 Factor Solution Final 2 Factor Solution
Factor 1 Factor 2 Factor 3 Factor 4 Factor 1 Factor 2
People who use drugs should have access to safe injection supplies (sterile needles and syringes). 0.871 0.107 0.072 −0.072 0.988 0.087
People who inject drugs should be able to do so in a way that prevents them from causing further harm to their health. 0.848 0.073 −0.033 0.007 0.874 −0.044
People who use drugs should have access to tools to test what's in their drugs. 0.752 0.191 0.029 0.059 0.834 0.055
People who use drugs should have access to supervised places where they can consume drugs safely. 0.803 −0.048 −0.090 −0.040 0.704 −0.199
People who use drugs should have access to a legal, non-contaminated drug supply. 0.757 −0.043 −0.133 −0.037 0.691 −0.211
People should be able to use drugs safely. 0.750 0.021 −0.293 0.079 0.612 −0.380
Racism affects the health of people who use drugs. 0.588 0.167 0.182 −0.126 0.750 0.184
People who seek medical assistance for overdoses should be protected from drug charges, arrests, and prosecutions. 0.539 0.108 −0.023 −0.241 0.741 0.019
Possession of "drug paraphernalia", like syringes and pipes, should be legal. 0.575 0.045 −0.231 −0.038 0.591 −0.225
It is possible to live a healthy life without stopping drug use. 0.281 −0.205 −0.536 −0.058 0.030 −0.697
Drugs make the world worse. −0.185 0.078 0.712 −0.015 0.048 0.853
Drug use has benefits. 0.245 −0.021 −0.656 0.189 −0.048 −0.784
Using drugs is immoral. −0.156 −0.042 0.587 0.212 −0.212 0.610
People who use drugs benefit society. 0.465 −0.167 −0.386 −0.173 0.321 −0.519
People who use drugs should be forced into treatment. 0.039 −0.087 0.549 0.278 −0.108 0.511
People who use drugs will naturally end up homeless. 0.013 −0.209 0.534 0.229 −0.193 0.440
People should have access to tools for safer sex (like condoms, STD tests). 0.142 0.761 −0.082 −0.027
People who use drugs should have access to naloxone/NARCAN. 0.356 0.346 0.075 −0.258
People who actively use drugs should have access to therapy/counseling. 0.108 0.668 0.091 −0.171
People who return to using drugs after a period of abstinence should be allowed to continue in treatment. 0.143 0.326 0.011 −0.390
Sobriety should be required for treatment. −0.267 0.001 0.028 0.361
Medications used to treat addiction (buprenorphine, naltrexone, or methadone) are an appropriate treatment option for people who use drugs. 0.204 0.292 −0.044 −0.202
Sobriety should not be a requirement to access public housing. 0.416 −0.104 −0.204 −0.338
Possession of all drugs should be decriminalized (possession would not lead to legal repercussions). 0.697 −0.174 −0.091 −0.121
It should be legal for adults to purchase drugs from a dispensary/shop. 0.447 0.089 −0.476 0.106
People use drugs to escape. 0.107 0.521 0.263 0.221
People who use drugs should be treated with respect. 0.322 0.122 −0.044 −0.483
Poverty affects the health of people who use drugs. 0.314 0.355 0.217 −0.102
Some ways of using drugs are safer than others. 0.407 0.274 −0.320 0.097
People who use drugs deserve to live good lives. 0.390 0.231 −0.090 −0.340
Reducing drug use is a reasonable goal for people who use drugs. 0.181 0.467 0.089 0.066
Some people who use drugs cannot be expected to quit immediately. 0.078 0.406 −0.203 −0.220
People who use drugs should be involved in creating the programs and policies that serve them. 0.238 0.219 −0.098 −0.310
Relapse may be a part of the recovery process. 0.011 0.417 −0.005 −0.386
Harm reduction complements traditional addiction prevention, treatment, and recovery services. 0.552 0.223 0.130 −0.099
Reducing the negative consequences of drug use encourages more people to use drugs. −0.413 0.055 −0.038 0.319
Drug use will always be part of society. −0.190 0.544 −0.494 −0.028
Chaotic drug use is a rational response to experiences like trauma, homelessness, hunger, and poverty. 0.272 0.046 −0.049 −0.187

The resulting scale consists of the following items:

Code
final.items <- hr %>% 
      select(
        # Factor 1
        q7, q1, q6, q18, q19, q21, q24, q14, q13,
        # Factor 2
        q33, q42, q36, q35, q38, q34, q41)


items <- readr::read_csv("references/codebooks/20231025_hr-scale-codebook.csv")

items %>% 
  filter(
    grepl("q", var_label)
    ) %>% 
  filter(var_label %in% 
           c(
             # Factor 1
              "q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",
               # Factor 2 
               "q33", "q42", "q36", "q35", "q38", "q34", "q41"
           )) %>% 
  arrange(
      factor(
    var_label,
    levels = c(# Factor 1
      # Factor 1
              "q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",
               # Factor 2 
               "q33", "q42", "q36", "q35", "q38", "q34", "q41"  
      ))) %>% 
  mutate(
    item = 
      case_when(
        var_label %in% c("q42", "q35", "q34", "q41", "q39") ~ paste0(item, "*"),
        TRUE ~ item
      ),
    Factor = c(rep("Factor 1", 9), rep("Factor 2", 7))
  ) %>% 
    select(Factor, item) %>%
  gt(groupname_col = "Factor") %>% 
  tab_header(
    title = "Table 9: Final Items"
  ) %>% 
  cols_label(
    item = ""
  ) %>% 
  tab_footnote("* indicates reverse coded items")
Table 9: Final Items
Factor 1
People who use drugs should have access to safe injection supplies (sterile needles and syringes).
People who inject drugs should be able to do so in a way that prevents them from causing further harm to their health.
People who use drugs should have access to tools to test what's in their drugs.
People who use drugs should have access to supervised places where they can consume drugs safely.
People who use drugs should have access to a legal, non-contaminated drug supply.
People should be able to use drugs safely.
Racism affects the health of people who use drugs.
People who seek medical assistance for overdoses should be protected from drug charges, arrests, and prosecutions.
Possession of "drug paraphernalia", like syringes and pipes, should be legal.
Factor 2
It is possible to live a healthy life without stopping drug use.
Drugs make the world worse. *
Drug use has benefits.
Using drugs is immoral. *
People who use drugs benefit society.
People who use drugs should be forced into treatment. *
People who use drugs will naturally end up homeless. *
* indicates reverse coded items

Scale Inspection

The first factor shows good reliability, with alpha, average split-half, and greatest lower bound estimates of .92 or greater. The second factor has acceptable split-half and internal consistency reliability, but its greatest lower bound is very low, which continues to be a concern. Overall, considering the number of items, the EFA fit statistics, and reliability, this model provides better fit and output than the initial model.

Code
scale1 <- df %>% 
  select(q7, q1, q6, q24, q14, q18, q19, q13, q21)
scale2 <- df %>% 
  select(q42, q36, q33, q35, q38, q34, q41)


splithalf1 <- splitHalf(scale1)
splithalf2 <- splitHalf(scale2)

alpha1 <- psych::alpha(scale1, check.keys = TRUE)
alpha2 <- psych::alpha(scale2, check.keys = TRUE)

glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs"))
glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs"))

data.frame(Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2"),
           `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                                       "Average Split Half", "Alpha", "Greatest Lower Bound"),
           Statistic = c(round(splithalf1$meanr,2),
                         round(alpha1$total$raw_alpha, 2), 
                         round(glb1$glb, 2),
                         round(splithalf2$meanr,2),
                         round(alpha2$total$raw_alpha, 2), 
                         round(glb2$glb, 2))) %>% 
  gt::gt(groupname_col = "Factor") %>% 
  gt::tab_header(title = "Table 10: Reliability Statistics For All Scales")
Table 10: Reliability Statistics For All Scales
Reliability.Statistic Statistic
Factor 1
Average Split Half 0.92
Alpha 0.92
Greatest Lower Bound 0.94
Factor 2
Average Split Half 0.86
Alpha 0.85
Greatest Lower Bound 0.47

Item Response Distribution

Both factors show response distributions similar to those in the initial study. The more extreme response options are somewhat underused, but every response option is used across all items.
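The per-item response proportions plotted below can be reproduced in base R with `table()` and `prop.table()`; the toy responses here are illustrative:

```r
# Response proportions for one 1-5 item (the quantity that
# alpha1$response.freq reports per item).
responses <- c(1, 2, 2, 3, 4, 5, 5, 5, 5, 4)
prop.table(table(factor(responses, levels = 1:5)))
# levels kept via factor() so unused options still appear as 0
```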

Code
data.frame(alpha1$response.freq) %>% 
  mutate(item = rownames(.)) %>% 
  left_join(item %>% 
              select(qname, question), 
            by = c("item" = "qname")) %>% 
  pivot_longer(c(X1:X5)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly disagree",
             name == "X2" ~ "Somewhat disagree",
             name == "X3" ~ "Neither agree nor disagree",
             name == "X4" ~ "Somewhat agree",
             name == "X5" ~ "Strongly agree"
           ),
             levels = c("Strongly disagree", 
                        "Somewhat disagree",
                        "Neither agree nor disagree",
                        "Somewhat agree",
                        "Strongly agree"))) %>% 
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value*100,0),"%")), 
            position = position_stack(vjust = .5),
            size = 3) +
    coord_flip() +
    ggforce::facet_col(facets = vars(question), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Factor 1") +
    theme(
     axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 10),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )
data.frame(alpha2$response.freq) %>% 
  mutate(item = rownames(.)) %>% 
  left_join(item %>% 
              select(qname, question), 
            by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X5)) %>% 
  mutate(Response = 
  factor(case_when(
    name == "X1" ~ "Strongly disagree",
  name == "X2" ~ "Somewhat disagree",
  name == "X3" ~ "Neither agree nor disagree",
  name == "X4" ~ "Somewhat agree",
  name == "X5" ~ "Strongly agree"),
  levels = c("Strongly disagree", 
             "Somewhat disagree",
             "Neither agree nor disagree",
             "Somewhat agree",
             "Strongly agree"))) %>% 
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value*100,0),"%")), 
            position = position_stack(vjust = .5),
            size = 3) +
    coord_flip() +
    ggforce::facet_col(facets = vars(question), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Factor 2") +
    theme(
     axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 10),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )

Figure 5: Study 2 EFA, Factor 1 Response Distribution

Figure 6: Study 2 EFA, Factor 2 Response Distribution

Study 3

Study 3 was preregistered; the preregistration can be accessed at https://osf.io/q4uzv. In this study, we conducted a confirmatory factor analysis on new data using the factor structure from Study 2. The confirmatory factor analysis was fit using weighted least squares estimation, treating response categories as ordinal rather than numeric, which was a deviation from the preregistration. We used Hu and Bentler's (1999) cutoffs of .95 for TLI and CFI and .06 for RMSEA.
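Applying these cutoffs can be sketched as a small check against the named vector that `fitmeasures()` returns; the helper name `meets_cutoffs` is illustrative:

```r
# TRUE means the index meets the preregistered cutoff
# (CFI/TLI >= .95, RMSEA <= .06).
meets_cutoffs <- function(fits) {
  c(cfi   = unname(fits["cfi"]   >= .95),
    tli   = unname(fits["tli"]   >= .95),
    rmsea = unname(fits["rmsea"] <= .06))
}

# Values like those reported for the Study 3 model:
meets_cutoffs(c(cfi = .98, tli = .97, rmsea = .07))
```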

Participants

Through Prolific, we recruited 402 participants for Study 3. These participants were 52% men (n = 205) and 70% White (n = 277); most had completed a bachelor's degree (41%, n = 161), and 88% reported no history of SUD diagnosis. Table 11 provides detail about the participants.

Code
df <- read.csv("data/clean/20231025_hr-scale-confirmatory-data2.csv")


demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2,2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age", gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity", ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis")
              ) %>%
  as_gt() %>%
  tab_header(title = "Table 11: Study 3 Participants")
Table 11: Study 3 Participants
Characteristic N = 393¹
Age 37.03 (11.59)
Gender Identity
    Man 205 (52%)
    Woman 174 (44%)
    Non-Binary, Agender, or Other 14 (3.6%)
Race/Ethnicity
    White 277 (70%)
    Asian or Asian American 41 (10%)
    Black or African American 35 (8.9%)
    Latine or Hispanic 28 (7.1%)
    Multiracial 12 (3.1%)
Highest Level of Education
    Bachelor’s Degree 161 (41%)
    High School 129 (33%)
    Associate’s Degree 46 (12%)
    Master’s Degree 34 (8.7%)
    Professional Degree (J.D., M.D., etc.) 9 (2.3%)
    Other option not represented here 5 (1.3%)
    Doctorate Degree (PhD, PsyD, etc.) 4 (1.0%)
    Less than High School 3 (0.8%)
History of Substance Use Disorder Diagnosis
    No 346 (88%)
    Yes 40 (10%)
    I don’t know 5 (1.3%)
    Prefer not to say 2 (0.5%)
1 Mean (SD); n (%)

Confirmatory Factor Analysis

The model developed in Study 2 was fit to Study 3's data using confirmatory factor analysis with weighted least squares estimation and ordinal indicators. The model showed acceptable fit on most indicators: \(\chi^2\)(103) = 315.46, p < .001; CFI = .98; TLI = .97; RMSEA = .07; SRMR = .18. Figure 7 provides factor loadings and covariances for the model.

Code
# Import data
df <- read.csv("data/clean/20231025_hr-scale-confirmatory-data2.csv")

# Select items
hr <- df %>% # N = 393
  select(starts_with("q"), -Q_RecaptchaScore)

# Study 2 Model Fit
model <-  'F1 =~ q7 + q1 + q6 + q18 + q19 + q21 + q24 + q14 + q13
           F2 =~ q33 + q42 + q36 + q35 + q38 + q34 + q41

           F1 ~~ F2' 

fit <- cfa(
  model,
  data = df,
  ordered = TRUE,
  estimator = "WLS",
  std.lv = TRUE) 

fitmeasures(fit,
            c("cfi", "tli", "srmr", "rmsea", "chisq", "df", "pvalue")) %>% 
  as_tibble(rownames = "Item") %>% 
  gt() %>% 
  tab_header(
    title = "Table 12: Study 3 CFA Fit Statistics"
  ) %>% 
  fmt_number(columns = 2, 
             decimals = 2)
Table 12: Study 3 CFA Fit Statistics
Item value
cfi 0.98
tli 0.97
srmr 0.18
rmsea 0.07
chisq 315.46
df 103.00
pvalue 0.00
Code
plot(set_cfa_layout(
  semPaths(fit, whatLabels="est",
        sizeMan = 3,
        node.width = 1,
        thresholds = FALSE,
        edge.label.cex = .75,
        style = "ram",
        mar = c(5, 1, 5, 1),
        intercepts = FALSE,
        DoNotPlot = TRUE),
  fcov_curve = 1.75,
  loading_position = .9))

Figure 7: Study 3 CFA

Scale Inspection

Code
meanscores <- df %>%
  # Reverse code the negatively worded items first (6 - x on the
  # 5-point scale), so the subscale means use the recoded values
  mutate(across(c(q42, q35, q34, q41),
                ~ 6 - .)) %>%
  mutate(
    Strategies =
      rowMeans(pick(all_of(colnames(scale1))), na.rm = TRUE),
    Principles =
      rowMeans(pick(all_of(colnames(scale2))), na.rm = TRUE))
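The reverse coding above relies on the fact that, on a 1-5 ordinal scale, 6 − x swaps the endpoints and leaves the midpoint unchanged:

```r
# 6 - x reverses a 1-5 item: 1 <-> 5, 2 <-> 4, and 3 stays 3,
# so reverse-worded items align with the rest before averaging.
x <- 1:5
6 - x  # 5 4 3 2 1
```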
Code
scale1 <- df %>% 
  select(q7, q1, q6, q24, q14, q18, q19, q13, q21)
scale2 <- df %>% 
  select(q42, q36, q33, q35, q38, q34, q41)


splithalf1 <- splitHalf(scale1)
splithalf2 <- splitHalf(scale2)

alpha1 <- psych::alpha(scale1, check.keys = TRUE)
alpha2 <- psych::alpha(scale2, check.keys = TRUE)

glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs"))
glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs"))

data.frame(Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2"),
           `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                                       "Average Split Half", "Alpha", "Greatest Lower Bound"),
           Statistic = c(round(splithalf1$meanr,2),
                         round(alpha1$total$raw_alpha, 2), 
                         round(glb1$glb, 2),
                         round(splithalf2$meanr,2),
                         round(alpha2$total$raw_alpha, 2), 
                         round(glb2$glb, 2))) %>% 
  gt::gt(groupname_col = "Factor") %>% 
  gt::tab_header(title = "Table 10: Reliability Statistics For All Scales")
Table 10: Reliability Statistics For All Scales
Reliability.Statistic Statistic
Factor 1
Average Split Half 0.91
Alpha 0.92
Greatest Lower Bound 0.94
Factor 2
Average Split Half 0.86
Alpha 0.86
Greatest Lower Bound 0.46

The first factor again shows good reliability, with alpha, average split-half, and greatest lower bound estimates of .91 or greater. The second factor has acceptable split-half and internal consistency reliability, but its greatest lower bound remains very low, which continues to be a concern. Overall, these reliability estimates closely replicate those from Study 2.

Table 11 provides the mean, median, and standard deviation for both subscales. The Strategies subscale may show some ceiling effect, as scores cluster above the scale midpoint, while the Principles subscale shows a roughly normal but slightly skewed distribution. The two subscales show little to no relationship (r = -0.07, 95% CI [-0.16, 0.03]).
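A confidence interval of this kind comes from Fisher's z transformation of r; a minimal base-R sketch (the helper name `cor_ci` is illustrative) shows the computation, using r and n close to those reported above:

```r
# 95% CI for a correlation via Fisher's z:
# z = atanh(r), SE = 1 / sqrt(n - 3), back-transform with tanh().
cor_ci <- function(r, n, level = .95) {
  z <- atanh(r)
  se <- 1 / sqrt(n - 3)
  crit <- qnorm(1 - (1 - level) / 2)
  tanh(c(lower = z - crit * se, upper = z + crit * se))
}

cor_ci(-0.07, n = 393)
```

`stats::cor.test()` reports an interval of the same form; small differences from the values above can arise from the exact n used.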

Code
gtExtras::gt_plt_summary(meanscores[60:61],
                         title = "Table 11: Mean Score for Both Factors")
Table 11: Mean Score for Both Factors
393 rows × 2 cols
Column Missing Mean Median SD
Strategies 0.0% 3.7 3.8 1.0
Principles 0.0% 2.7 2.7 0.4

Item Response Distribution

Both factors show response distributions similar to those in the initial study. The more extreme response options are somewhat underused, but every response option is used across all items.

Code
items1 <-
data.frame(alpha1$response.freq) %>% 
  mutate(item = rownames(.)) %>% 
  left_join(items %>% 
              select(var_label, item), 
            by = c("item" = "var_label")) %>% 
  pivot_longer(c(X1:X5)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly disagree",
             name == "X2" ~ "Somewhat disagree",
             name == "X3" ~ "Neither agree nor disagree",
             name == "X4" ~ "Somewhat agree",
             name == "X5" ~ "Strongly agree"
           ),
             levels = c("Strongly agree","Somewhat agree",
                        "Neither agree nor disagree","Somewhat disagree",
                        "Strongly disagree"))) %>% 
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value*100,0),"%")), 
            position = position_stack(vjust = .5),
            size = 3) +
    ggokabeito::scale_fill_okabe_ito(
      breaks = c("Strongly disagree","Somewhat disagree",
                        "Neither agree nor disagree","Somewhat agree",
                        "Strongly agree"
      )) +
    coord_flip() +
    ggforce::facet_col(facets = vars(item.y), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Harm Reduction Strategies") +
    theme(
     axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 16),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )

items2 <- 
data.frame(alpha2$response.freq) %>% 
  mutate(item = rownames(.)) %>% 
  left_join(items %>% 
              select(var_label, item), 
            by = c("item" = "var_label")) %>%
  pivot_longer(c(X1:X5)) %>% 
  mutate(Response = 
           factor(case_when(
             name == "X1" ~ "Strongly disagree",
             name == "X2" ~ "Somewhat disagree",
             name == "X3" ~ "Neither agree nor disagree",
             name == "X4" ~ "Somewhat agree",
             name == "X5" ~ "Strongly agree"
           ),
             levels = c("Strongly agree", "Somewhat agree", 
                 "Neither agree nor disagree", 
                 "Somewhat disagree", "Strongly disagree"))) %>% 
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value*100,0),"%")), 
            position = position_stack(vjust = .5),
            size = 3) +
  ggokabeito::scale_fill_okabe_ito(
      breaks = c("Strongly disagree","Somewhat disagree",
                        "Neither agree nor disagree","Somewhat agree",
                        "Strongly agree"
      )) +
    coord_flip() +
    ggforce::facet_col(facets = vars(item.y), 
                       scales = "free_y", 
                       space = "free",
                       labeller = label_wrap_gen(width = 100)
    ) +
    labs(title = "Harm Reduction Principles") +
    theme(
     axis.title = element_blank(),
      strip.background = element_blank(),
      legend.position = "top",
      axis.ticks = element_blank(),
      axis.text = element_blank(),
      panel.background = element_blank(),
      strip.text = element_text(size = 16),
      panel.spacing = unit(-.5, "lines"),
      legend.title = element_blank(),
      strip.clip = "off",
      plot.title = element_text(hjust = 0.5, size = 20)
    )

Finally, Figure 10 shows the mean score distributions for both scales in the current sample. As noted above, the Strategies subscale may show some ceiling effect, while the Principles subscale does not. Although the two subscales need additional validity evidence, this 16-item factor structure demonstrates acceptable reliability and structural validity.

Code
(scores <- meanscores %>% 
  pivot_longer(Strategies:Principles) %>% 
  mutate(sud_hx = 
           case_when(
             sud_hx == "I don’t know" ~ "No",
             sud_hx == "Prefer not to say" ~ "No",
             TRUE ~ sud_hx
           )) %>% 
  group_by(
    name, 
    #race
    #sud_hx, 
    #gid1
    ) %>% 
  mutate(avg = mean(value, na.rm = TRUE)) %>% 
  ungroup() %>%
  ggplot() + aes(x = name, y = value, fill = name) +
  geom_violin() +
  geom_jitter(alpha = .3, height = 0.1) +
  geom_hline(aes(yintercept = avg, col = name),
             size = 2, alpha = .5) +
    labs(title = "Harm Reduction Subscale Scores")+
    ggokabeito::scale_fill_okabe_ito() +
    ggokabeito::scale_color_okabe_ito() +
  cowplot::theme_half_open() +
    theme(
      axis.title = element_blank(),
      legend.position = "none",
      axis.text.x = element_blank(),
      axis.line.y = element_line(),
      axis.line.y.right = element_line(),
      panel.background = element_blank(),
      plot.title = element_text(hjust = 0.5, size = 25),
      strip.clip = "off"
    )  +
  facet_wrap(~ name, scales = "free_x",
             strip.position = "bottom"))

Figure 10: Mean Score Distribution